{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
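    "Multi-agent learning without communication amounts to several independent single-agent learners, so each one can be implemented separately.\n",
    "\n",
    "Each group of agents has its own actor and critic, and no data is exchanged between groups."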
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAR8AAAEYCAYAAABlUvL1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAloElEQVR4nO3dfXRU5Z0H8O+8ZIZJwsyQhMwkkmhYwZACgkTDCC2upKQQXZHUFxotcigecaBgLKfmHEClrWFpd3VZNdTuWUm3RWz2iArLizkBAy5DgCgaAgas2ERgJmCcmSSSSTL32T9s7jISQ4a8PDPy/ZxzzzH3eSbzu9fcL8991wghBIiIhphWdgFEdG1i+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRTSwuell17CDTfcgGHDhiEnJweHDh2SVQoRSSAlfF5//XUUFRXh6aefxvvvv4+bb74ZeXl5aGpqklEOEUmgkXFjaU5ODm699Va8+OKLAABFUZCWloZly5bhqaeeGupyiEgC/VB/YUdHB2pqalBcXKzO02q1yM3Nhcvl6vEzgUAAgUBA/VlRFDQ3NyMxMREajWbQayaivhFCoKWlBampqdBqe9+xGvLwuXDhAoLBIGw2W8h8m82Gjz/+uMfPlJSU4Nlnnx2K8ohoADQ2NmLUqFG99hny8LkaxcXFKCoqUn/2+XxIT09HY2MjzGazxMqI6FJ+vx9paWkYPnz4FfsOefgkJSVBp9PB4/GEzPd4PLDb7T1+xmg0wmg0XjbfbDYzfIgiUF8Ohwz52S6DwYApU6agsrJSnacoCiorK+FwOIa6HCKSRMpuV1FRERYsWIDs7GzcdttteOGFF9DW1oaFCxfKKIeIJJASPg888ADOnz+PNWvWwO12Y9KkSdi1a9dlB6GJ6LtLynU+/eX3+2GxWODz+XjMhyiChLNt8t4uIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkRdjhs2/fPtx9991ITU2FRqPBm2++GdIuhMCaNWuQkpICk8mE3NxcnDp1KqRPc3MzCgsLYTabYbVasWjRIrS2tvZrQYgouoQdPm1tbbj55pvx0ksv9di+fv16bNiwARs3bkR1dTXi4uKQl5eH9vZ2tU9hYSHq6upQUVGB7du3Y9++fXj00UevfimIKPqIfgAgtm7dqv6sKIqw2+3it7/9rTrP6/UKo9EoXnvtNSGEEMePHxcAxOHDh9U+O3fuFBqNRpw5c6ZP3+vz+QQA4fP5+lM+EQ2wcLbNAT3mc/r0abjdbuTm5qrzLBYLcnJy4HK5AAAulwtWqxXZ2dlqn9zcXGi1WlRXV/f4ewOBAPx+f8hERNFtQMPH7XYDwGXvXLfZbGqb2+1GcnJySLter0dCQoLa55tKSkpgsVjUKS0tbSDLJiIJouJsV3FxMXw+nzo1NjbKLomI+mlAw8dutwMAPB5PyHyPx6O22e12NDU1hbR3dXWhublZ7fNNRqMRZrM5ZCKi6Dag4ZORkQG73Y7Kykp1nt/vR3V1NRwOBwDA4XDA6/WipqZG7bNnzx4oioKcnJyBLIeIIpg+3A+0trbik08+UX8+ffo0jh49ioSEBKSnp2PFihX49a9/jTFjxiAjIwOrV69Gamoq5s6dCwAYN24cfvSjH2Hx4sXYuHEjOjs7sXTpUjz44INITU0dsAUjoggX7qm0vXv3CgCXTQsWLBBCfH26ffXq1cJmswmj0Shmzpwp6uvrQ37HF198IebPny/i4+OF2WwWCxcuFC0tLX2ugafaiSJTONumRgghJGbfVfH7/bBYLPD5fDz+QxRBwtk2o+JsFxF99zB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQM
HyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKQIK3xKSkpw6623Yvjw4UhOTsbcuXNRX18f0qe9vR1OpxOJiYmIj49HQUHBZa9PbmhoQH5+PmJjY5GcnIyVK1eiq6ur/0tDRFEjrPCpqqqC0+nEwYMHUVFRgc7OTsyaNQttbW1qnyeeeALbtm1DeXk5qqqqcPbsWcybN09tDwaDyM/PR0dHBw4cOICysjJs2rQJa9asGbilIqLI15+3EzY1NQkAoqqqSgghhNfrFTExMaK8vFztc+LECQFAuFwuIYQQO3bsEFqtVrjdbrVPaWmpMJvNIhAI9Ol7+cZSosgUzrbZr2M+Pp8PAJCQkAAAqKmpQWdnJ3Jzc9U+mZmZSE9Ph8vlAgC4XC5MmDABNptN7ZOXlwe/34+6uroevycQCMDv94dMRBTdrjp8FEXBihUrMG3aNIwfPx4A4Ha7YTAYYLVaQ/rabDa43W61z6XB093e3daTkpISWCwWdUpLS7vasokoQlx1+DidThw7dgxbtmwZyHp6VFxcDJ/Pp06NjY2D/p1ENLj0V/OhpUuXYvv27di3bx9GjRqlzrfb7ejo6IDX6w0Z/Xg8HtjtdrXPoUOHQn5f99mw7j7fZDQaYTQar6ZUIopQYY18hBBYunQptm7dij179iAjIyOkfcqUKYiJiUFlZaU6r76+Hg0NDXA4HAAAh8OB2tpaNDU1qX0qKipgNpuRlZXVn2UhoigS1sjH6XRi8+bNeOuttzB8+HD1GI3FYoHJZILFYsGiRYtQVFSEhIQEmM1mLFu2DA6HA1OnTgUAzJo1C1lZWXj44Yexfv16uN1urFq1Ck6nk6MbomtJOKfRAPQ4vfrqq2qfixcviscff1yMGDFCxMbGinvvvVecO3cu5Pd89tlnYvbs2cJkMomkpCTx5JNPis7Ozj7XwVPtRJEpnG1TI4QQ8qLv6vj9flgsFvh8PpjNZtnlENHfhbNt8t4uIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkRVjhU1paiokTJ8JsNsNsNsPhcGDnzp1qe3t7O5xOJxITExEfH4+CggL1VcjdGhoakJ+fj9jYWCQnJ2PlypXo6uoamKUhoqgRVviMGjUK69atQ01NDY4cOYI777wT99xzD+rq6gAATzzxBLZt24by8nJUVVXh7NmzmDdvnvr5YDCI/Px8dHR04MCBAygrK8OmTZuwZs2agV0qIop8/X1D4YgRI8R//Md/CK/XK2JiYkR5ebnaduLECQFAuFwuIYQQO3bsEFqtVrjdbrVPaWmpMJvNIhAI9Pk7+cZSosgUzrZ51cd8gsEgtmzZgra2NjgcDtTU1KCzsxO5ublqn8zMTKSnp8PlcgEAXC4XJkyYAJvNpvbJy8uD3+9XR089CQQC8Pv9IRMRRbeww6e2thbx8fEwGo147LHHsHXrVmRlZcHtdsNgMMBqtYb0t9lscLvdAAC32x0SPN3t3W3fpqSkBBaLRZ3S0tLCLZuIIkzY4XPTTTfh6NGjqK6uxpIlS7BgwQIcP358MGpTFRcXw+fzqVNjY+Ogfh8RDT59uB8wGAy48cYbAQBTpkzB4cOH8W//9m944IEH0NHRAa/XGzL68Xg8sNvtAAC73Y5Dhw6F/L7us2HdfXpiNBphNBrDLZWIIli/r/NRFAWBQABTpkxBTEwMKisr1bb6+no0NDTA4XAAABwOB2pra9HU1KT2qaiogNlsRlZWVn9LIaIoEtbIp7i4GLNnz0Z6ejpaWlqwefNmvPvuu9i9ezcsFgsWLVqEoqIiJCQkwGw2Y9myZXA4HJg6dSoAYNasWcjKysLDDz+M9evXw+12Y9WqVXA6
nRzZEF1jwgqfpqYm/PSnP8W5c+dgsVgwceJE7N69Gz/84Q8BAM8//zy0Wi0KCgoQCASQl5eHl19+Wf28TqfD9u3bsWTJEjgcDsTFxWHBggVYu3btwC4VEUU8jRBCyC4iXH6/HxaLBT6fD2azWXY5RPR34WybvLeLiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJ0a/wWbduHTQaDVasWKHOa29vh9PpRGJiIuLj41FQUKC+ErlbQ0MD8vPzERsbi+TkZKxcuRJdXV39KYWIosxVh8/hw4fx+9//HhMnTgyZ/8QTT2Dbtm0oLy9HVVUVzp49i3nz5qntwWAQ+fn56OjowIEDB1BWVoZNmzZhzZo1V78URBR9xFVoaWkRY8aMERUVFWLGjBli+fLlQgghvF6viImJEeXl5WrfEydOCADC5XIJIYTYsWOH0Gq1wu12q31KS0uF2WwWgUCgT9/v8/kEAOHz+a6mfCIaJOFsm1c18nE6ncjPz0dubm7I/JqaGnR2dobMz8zMRHp6OlwuFwDA5XJhwoQJsNlsap+8vDz4/X7U1dX1+H2BQAB+vz9kIqLoFta72gFgy5YteP/993H48OHL2txuNwwGA6xWa8h8m80Gt9ut9rk0eLrbu9t6UlJSgmeffTbcUokogoU18mlsbMTy5cvx5z//GcOGDRusmi5TXFwMn8+nTo2NjUP23UQ0OMIKn5qaGjQ1NeGWW26BXq+HXq9HVVUVNmzYAL1eD5vNho6ODni93pDPeTwe2O12AIDdbr/s7Ff3z919vsloNMJsNodMRBTdwgqfmTNnora2FkePHlWn7OxsFBYWqv8dExODyspK9TP19fVoaGiAw+EAADgcDtTW1qKpqUntU1FRAbPZjKysrAFaLCKKdGEd8xk+fDjGjx8fMi8uLg6JiYnq/EWLFqGoqAgJCQkwm81YtmwZHA4Hpk6dCgCYNWsWsrKy8PDDD2P9+vVwu91YtWoVnE4njEbjAC0WEUW6sA84X8nzzz8PrVaLgoICBAIB5OXl4eWXX1bbdTodtm/fjiVLlsDhcCAuLg4LFizA2rVrB7oUIopgGiGEkF1EuPx+PywWC3w+H4//EEWQcLZN3ttFRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpBjw63yIBpOiKPj8zOfwtfig1+oxOmM0L06NUgwfigpCCJz+22lseXcL9l3Yh3ZdOzTQYKx2LObfNh/TsqchJiZGdpkUBl5kSBFPCIGaYzX41Z5f4aukr6CJ0UCj0XzdpggEm4OYZ5uHxXcvhiHGILnaaxsvMqTvlPMXzuO5d57DxZSL0Bq0avAAgEargS5RhzcuvIFd+3ZJrJLCxfChiCaEwFv73oIv0RcSOpfSaDTQWXX4y9G/oLWtdYgrpKvF8KGI9tVXX2HXJ7ugi9X12k+j0eBM7Bkc+fDIEFVG/cXwoYimKArag+3fOuq5lNAJXGy/OARV0UBg+FBE0+l0iNPHoS/nRTRdGgyPGz4EVdFAYPhQRDOZTLg7824EW4O99hNCICOQgSkTpwxRZdRfDB+KaBqNBvnfz8dI30gIpefRjxAColngwewHYTKZhrhCuloMH4p4I6wjsPqu1bCcsyB4MRiyC6YEFQiPwEPpD+FOx50Sq6Rw8SJDigpCCLg9brxR9QZ2N+xGK1qhhRaT4iZh/vT5mPS9SdDpej8jRoMvnG2T4UNRRQiBC19cQFtbG3RaHVJSUqDXR/5dQsFgEK2trWhvb0ddXR0URVHbNBoNxo0bh7i4OMTHx0d1iIazbUb+/zWiS2g0GoxMGomRSSNll3JFgUAAf/vb37Bv3z588MEHOHbsGLq6utDS0nJZ3/j4eOj1emRlZWHSpEmYMWMGMjIyhvTlnEONIx+iASSEwJdffolt
27ahoqICJ0+eREdHB7Tarw+v9na9UvemqCgKYmJicOONN+LOO+/E3LlzkZSU1KdrnWTjbheRBK2trfjv//5vvP766zh37hy0Wm2/AkMIAUVRkJycjB//+Md44IEHYLFYBrDigTdoN5Y+88wz0Gg0IVNmZqba3t7eDqfTicTERMTHx6OgoOCyVyM3NDQgPz8fsbGxSE5OxsqVK9HV1RVOGUQRRVEU1NXVwel04oUXXoDH44FOp+v3SEWj0UCn0+HChQt4+eWXsWTJErz//vsIBnu/5ilahH2q/Xvf+x7OnTunTu+9957a9sQTT2Dbtm0oLy9HVVUVzp49i3nz5qntwWAQ+fn56OjowIEDB1BWVoZNmzZhzZo1A7M0REMsEAigvLwcixcvRm1t7YCEzjd1h9DHH3+Mxx9/HH/605/Q3t4+oN8hQ1i7Xc888wzefPNNHD169LI2n8+HkSNHYvPmzfjxj38MAPj4448xbtw4uFwuTJ06FTt37sRdd92Fs2fPwmazAQA2btyIX/7ylzh//jwMhr49i4W7XRQJ2tra8Ktf/QoVFRUQQgzJMZnu75k+fTrWrl0bcbthg/o8n1OnTiE1NRWjR49GYWEhGhoaAAA1NTXo7OxEbm6u2jczMxPp6elwuVwAAJfLhQkTJqjBAwB5eXnw+/2oq6v71u8MBALw+/0hE5FMra2tWLt2Ld555x0AvR9IHkjd37N//36sXr0aPp9vSL53MIQVPjk5Odi0aRN27dqF0tJSnD59Gt///vfR0tICt9sNg8EAq9Ua8hmbzQa32w0AcLvdIcHT3d7d9m1KSkpgsVjUKS0tLZyyiQZUd/BUVFRIOwOl0Wjw3nvvYdWqVVEbQGGFz+zZs3Hfffdh4sSJyMvLw44dO+D1evGXv/xlsOoDABQXF8Pn86lTY2PjoH4f0bdRFAWvv/463nnnHemnvrsDqKysLCoPQvfr3i6r1YqxY8fik08+gd1uR0dHB7xeb0gfj8cDu90OALDb7Zed/er+ubtPT4xGI8xmc8hENNSEEDhw4AB+//vfq9ftyKbVavFf//VfqKys7NNjRyJJv9Zga2sr/vrXvyIlJQVTpkxBTEwMKisr1fb6+no0NDTA4XAAABwOB2pra9HU1KT2qaiogNlsRlZWVn9KIRp0Xq8Xv/vd79DV1SV91HOpYDCI559/HufPn5ddSljCCp9f/OIXqKqqwmeffYYDBw7g3nvvhU6nw/z582GxWLBo0SIUFRVh7969qKmpwcKFC+FwODB16lQAwKxZs5CVlYWHH34YH374IXbv3o1Vq1bB6XTy3UsU0RRFQWlpKRoaGiIqeICvd788Hg9eeOGFqNr9Cit8Pv/8c8yfPx833XQT7r//fiQmJuLgwYMYOfLr+2yef/553HXXXSgoKMAPfvAD2O12vPHGG+rndTodtm/fDp1OB4fDgYceegg//elPsXbt2oFdKqIBdvr0afzP//yP7DJ6VVlZiY8//lh2GX3G2yuIrkBRFPz617/G1q1bI+ZYT08URcGsWbPw3HPPSbsznu/tIhpAzc3N2LNnT8Ttbn2TRqPB/v37e71sJZIwfIiu4MiRI/jyyy+jInza2trUi3ojHcOHqBeKoqCioiJqHvCl0+lQUVERFQeeGT5EvWhra8MHH3wgu4yw1NXVXXa9XSRi+BD14uTJk/B6vRG/y9Wte9ert3slIwXDh6gXLS0tUfe8qWAwGBU3XzN8iHpRXV0dNcd7uul0OlRXV8su44oYPkS98Pl8UbPL1U2j0UTFne4MHyKSguFDRFIwfIhICoYPUS/Gjh0b8nbRaKAoCsaOHSu7jCti+BD1YvTo0VEZPqNHj5ZdxhUxfIh6cd111yE2NlZ2GWExGAxIT0+XXcYVMXyIepGSkoLk5OSoeUSpEAIJCQkMH6JoZzKZMH369KgKn5ycHAwfPlx2KVfE8CHqhUajwcyZM6PqQsPc3NyoqJfhQ3QF48aNw6hRoyJ+9COEwMiR
IzFp0iTZpfQJw4foCkwmE+6///6IP+ulKArmzZsXNY8WZvgQXYFGo0F+fn5Ej366Rz3z5s2Lil0ugOFD1CdWqxU/+9nPIvYB8hqNBo888oj6JploEJlrkijCaDQazJkzB3fccUfE7X4pioKpU6dG1agHuIrwOXPmDB566CEkJibCZDJhwoQJOHLkiNouhMCaNWuQkpICk8mE3NxcnDp1KuR3NDc3o7CwEGazGVarFYsWLUJra2v/l4ZoEBkMBixbtgzXXXddxOx+CSGQnJyM5cuXw2QyyS4nLGGFz5dffolp06YhJiYGO3fuxPHjx/Ev//IvGDFihNpn/fr12LBhAzZu3Ijq6mrExcUhLy8P7e3tap/CwkLU1dWhoqIC27dvx759+/Doo48O3FIRDZL09HQ89dRT0Ov1ERFAOp0OTz75JMaMGSO7lLCF9dLAp556Cv/7v/+L/fv399guhEBqaiqefPJJ/OIXvwDw9cOYbDYbNm3ahAcffBAnTpxAVlYWDh8+jOzsbADArl27MGfOHHz++edITU29Yh18aSDJpCgK3njjDfzud79DR0eHtF0dnU6HZcuW4Sc/+UnEPG1x0F4a+PbbbyM7Oxv33XcfkpOTMXnyZPzhD39Q20+fPg23243c3Fx1nsViQU5OjvouIZfLBavVqgYP8PVFUVqt9lsf/RgIBOD3+0MmIlm0Wi3mzZuHlStXwmAwDPkISAgRkcETrrDC59NPP0VpaSnGjBmD3bt3Y8mSJfj5z3+OsrIyAFDflGiz2UI+Z7PZ1Da3243k5OSQdr1ej4SEhG9902JJSQksFos6paWlhVM20YDTarW49957UVxcjKSkpCELICEERowYgaKioqgOHiDM8FEUBbfccguee+45TJ48GY8++igWL16MjRs3DlZ9AIDi4mL4fD51amxsHNTvI+oLrVaLe+65By+++CImTpyIYDA4aCEkhEAwGERmZiZeeOEFPPDAA1EdPECY4ZOSkoKsrKyQeePGjUNDQwMAwG63AwA8Hk9IH4/Ho7bZ7XY0NTWFtHd1daG5uVnt801GoxFmszlkIooUY8eOxYsvvgin04lhw4ZBUZQBCyEhBBRFgcFgwM9+9jOUlpZiwoQJUXVK/duEFT7Tpk1DfX19yLyTJ0/i+uuvBwBkZGTAbrejsrJSbff7/aiurobD4QAAOBwOeL1e1NTUqH327NkDRVGQk5Nz1QtCJFN8fDwWLVqEP/7xj7jnnnswbNiwfo2Eukc6BoMBs2fPxquvvoolS5bAYrEMcOXyhHW26/Dhw7j99tvx7LPP4v7778ehQ4ewePFivPLKKygsLAQA/PM//zPWrVuHsrIyZGRkYPXq1fjoo49w/PhxDBs2DAAwe/ZseDwebNy4EZ2dnVi4cCGys7OxefPmPtXBs10UyRRFwaeffop9+/Zh7969OHnyJC5evAi9Xq/2uXTkcukmGAwGYTQa8Q//8A+44447cMcdd+DGG2+Mml2scLbNsMIHALZv347i4mKcOnUKGRkZKCoqwuLFi9V2IQSefvppvPLKK/B6vZg+fTpefvnlkGfKNjc3Y+nSpdi2bRu0Wi0KCgqwYcMGxMfHD/gCEskUCATQ0NCA48eP48CBA2htbUVdXV3IVdIajQZZWVmIj4+Hw+HAuHHjkJGRof5jHU0GNXwiAcOHolUwGERbW1vIaEej0SAuLi5qRje9CWfb1PfaSkQDSqfT8R/Mv+ONpUQkBcOHiKRg+BCRFAwfIpKCB5z7QVEUeL1eeP5+T5per0f69dfDYDB8J65AJRpMDJ+r0NnRgZoDB/BRZSW8H32EGJ8PACB0OgRTU5F2++2YNmcORqWnM4SIvgWv8wmT98svUfHHP+L8229jhFYL/Tee6SuEQEBR0Gw2Y1pRESZNnRpyZSvRd9mgPc/nWvfF+fN4rbgYbdu2YaRef1nwAF9fMDZMp0NKayuqn3kG72zejGAwKKFaosjG8Omjzo4O7C4rQ9ypU4jr
w5WoGo0GI7VafPLaa6j/6KOIeOQmUSRh+PTR/t270bpzJ4aFcQm8RqOBPRjEzpISeL/8chCrI4o+DJ8+6OrqQu3u3bDqdGEfQNZoNIg7fx7HLnmECBExfPrEfe4cOk6evOrPW2NiULd3b8S974lIJoZPH/z1xAmY+/GWAi2A5hMnEAgEBrYwoijG8OmD/h4s1mg0AA84E4Vg+PSBVqdDf6JDCAHNd+BZLUQDieHTBzeNHw+/yXTVI6CgEEi6+WYYjcYBrowoejF8+iBp5EiYxo276tFPc1cXJvzjP0Lbw0WJRNcqbg19oNPpkH333WgWIuzRjyIEOtPT8b1JkwanOKIoxfDpo9t+8APY7rsPbWHcKiGEwFmDAXNXr+7zw/GJrhUMnz7S6XT44YMPQsnOhq8P72NShMBZrRZTHnsM6aNH8+52om9g+IRhuNmMnzzzDK575BGcA9De1XVZH0UI+Do74U5Oxpx//VdMnzOHx3qIesBnPYTJZDJh1vz5uGHSJBx991387eBBBP/+MDHo9TBlZuKmGTNw64wZGJGQwBEP0bfg83z6QQiB9vZ2eL1eAF/vmiUlJUGj0TB06JrE93YNEY1GA5PJBJPJJLsUoqgTleHTPVjz+/2SKyGiS3Vvk33ZoYrK8Pniiy8AAGlpaZIrIaKetLS0wGKx9NonKsMnISEBANDQ0HDFBfwu8/v9SEtLQ2Nj4zX9Cl6uh69FwnoQQqClpQWpqalX7BuV4dN96tpisVzTf2zdzGYz1wO4HrrJXg99HRDwAhQikoLhQ0RSRGX4GI1GPP3009f8Iyq4Hr7G9fC1aFsPUXmRIRFFv6gc+RBR9GP4EJEUDB8ikoLhQ0RSRGX4vPTSS7jhhhswbNgw5OTk4NChQ7JLGjAlJSW49dZbMXz4cCQnJ2Pu3Lmor68P6dPe3g6n04nExETEx8ejoKAAHo8npE9DQwPy8/MRGxuL5ORkrFy5El09PH8oWqxbtw4ajQYrVqxQ510r6+HMmTN46KGHkJiYCJPJhAkTJuDIkSNquxACa9asQUpKCkwmE3Jzc3Hq1KmQ39Hc3IzCwkKYzWZYrVYsWrQIra2tQ70ooUSU2bJlizAYDOI///M/RV1dnVi8eLGwWq3C4/HILm1A5OXliVdffVUcO3ZMHD16VMyZM0ekp6eL1tZWtc9jjz0m0tLSRGVlpThy5IiYOnWquP3229X2rq4uMX78eJGbmys++OADsWPHDpGUlCSKi4tlLFK/HTp0SNxwww1i4sSJYvny5er8a2E9NDc3i+uvv1488sgjorq6Wnz66adi9+7d4pNPPlH7rFu3TlgsFvHmm2+KDz/8UPzTP/2TyMjIEBcvXlT7/OhHPxI333yzOHjwoNi/f7+48cYbxfz582Uskirqwue2224TTqdT/TkYDIrU1FRRUlIisarB09TUJACIqqoqIYQQXq9XxMTEiPLycrXPiRMnBADhcrmEEELs2LFDaLVa4Xa71T6lpaXCbDaLQCAwtAvQTy0tLWLMmDGioqJCzJgxQw2fa2U9/PKXvxTTp0//1nZFUYTdbhe//e1v1Xler1cYjUbx2muvCSGEOH78uAAgDh8+rPbZuXOn0Gg04syZM4NX/BVE1W5XR0cHampqkJubq87TarXIzc2Fy+WSWNng8fl8AP7/Ztqamhp0dnaGrIPMzEykp6er68DlcmHChAmw2Wxqn7y8PPj9ftTV1Q1h9f3ndDqRn58fsrzAtbMe3n77bWRnZ+O+++5DcnIyJk+ejD/84Q9q++nTp+F2u0PWg8ViQU5OTsh6sFqtyM7OVvvk5uZCq9Wiurp66BbmG6IqfC5cuIBgMBjyxwQANpsN7u5HmX6HKIqCFStWYNq0aRg/fjwAwO12w2AwwGq1hvS9dB243e4e11F3W7TYsmUL3n//fZSUlFzWdq2sh08//RSlpaUYM2YMdu/ejSVLluDnP/85ysrKAPz/cvS2TbjdbiQnJ4e06/V6JCQkSF0P
UXlX+7XC6XTi2LFjeO+992SXMuQaGxuxfPlyVFRUYNiwYbLLkUZRFGRnZ+O5554DAEyePBnHjh3Dxo0bsWDBAsnV9U9UjXySkpKg0+kuO6Ph8Xhgt9slVTU4li5diu3bt2Pv3r0YNWqUOt9ut6Ojo0N9bnS3S9eB3W7vcR11t0WDmpoaNDU14ZZbboFer4der0dVVRU2bNgAvV4Pm812TayHlJQUZGVlhcwbN24cGhoaAPz/cvS2TdjtdjQ1NYW0d3V1obm5Wep6iKrwMRgMmDJlCiorK9V5iqKgsrISDodDYmUDRwiBpUuXYuvWrdizZw8yMjJC2qdMmYKYmJiQdVBfX4+GhgZ1HTgcDtTW1ob8wVVUVMBsNl/2hxypZs6cidraWhw9elSdsrOzUVhYqP73tbAepk2bdtmlFidPnsT1118PAMjIyIDdbg9ZD36/H9XV1SHrwev1oqamRu2zZ88eKIqCnJycIViKbyHtUPdV2rJlizAajWLTpk3i+PHj4tFHHxVWqzXkjEY0W7JkibBYLOLdd98V586dU6evvvpK7fPYY4+J9PR0sWfPHnHkyBHhcDiEw+FQ27tPMc+aNUscPXpU7Nq1S4wcOTKqTjH35NKzXUJcG+vh0KFDQq/Xi9/85jfi1KlT4s9//rOIjY0Vf/rTn9Q+69atE1arVbz11lvio48+Evfcc0+Pp9onT54sqqurxXvvvSfGjBnDU+1X49///d9Fenq6MBgM4rbbbhMHDx6UXdKAAdDj9Oqrr6p9Ll68KB5//HExYsQIERsbK+69915x7ty5kN/z2WefidmzZwuTySSSkpLEk08+KTo7O4d4aQbWN8PnWlkP27ZtE+PHjxdGo1FkZmaKV155JaRdURSxevVqYbPZhNFoFDNnzhT19fUhfb744gsxf/58ER8fL8xms1i4cKFoaWkZysW4DB+pQURSRNUxHyL67mD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUvwfJN3Mp9jV8mcAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 300x300 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import gym\n",
    "\n",
    "\n",
    "# Define the environment\n",
    "class MyWrapper(gym.Wrapper):\n",
    "\n",
    "    def __init__(self):\n",
    "        from pettingzoo.mpe import simple_tag_v3\n",
    "        env = simple_tag_v3.env(num_good=1,\n",
    "                                num_adversaries=1,\n",
    "                                num_obstacles=1,\n",
    "                                max_cycles=1e8,\n",
    "                                render_mode='rgb_array')\n",
    "        super().__init__(env)\n",
    "        self.env = env\n",
    "        self.step_n = 0\n",
    "\n",
    "    def reset(self):\n",
    "        self.env.reset()\n",
    "        self.step_n = 0\n",
    "        return self.state()\n",
    "\n",
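    "    def state(self):\n",
    "        state = []\n",
    "        for i in self.env.agents:\n",
    "            # Use the wrapped env rather than the global `env` variable\n",
    "            state.append(self.env.observe(i).tolist())\n",
    "        # Pad the last agent's observation with two zeros so that it\n",
    "        # matches the 10-dimensional network input\n",
    "        state[-1].extend([0.0, 0.0])\n",
    "        return state\n",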
    "\n",
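    "    def step(self, action):\n",
    "        # Apply the chosen actions once, then send environment action 0\n",
    "        # (-1 + 1 in _step) for four more substeps, summing the rewards\n",
    "        reward_sum = [0, 0]\n",
    "        for i in range(5):\n",
    "            if i != 0:\n",
    "                action = [-1, -1]\n",
    "            next_state, reward, over = self._step(action)\n",
    "            for j in range(2):\n",
    "                reward_sum[j] += reward[j]\n",
    "            # Undo the increment made in _step so that the five\n",
    "            # substeps count as a single step\n",
    "            self.step_n -= 1\n",
    "\n",
    "        self.step_n += 1\n",
    "\n",
    "        return next_state, reward_sum, over\n",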
    "\n",
    "    def _step(self, action):\n",
    "        # Policy actions 0..3 are shifted to environment actions 1..4\n",
    "        for i, _ in enumerate(self.env.agent_iter(2)):\n",
    "            self.env.step(action[i] + 1)\n",
    "\n",
    "        reward = [self.env.rewards[i] for i in self.env.agents]\n",
    "\n",
    "        _, _, termination, truncation, _ = self.env.last()\n",
    "        over = termination or truncation\n",
    "\n",
    "        # Cap the episode length\n",
    "        self.step_n += 1\n",
    "        if self.step_n >= 100:\n",
    "            over = True\n",
    "\n",
    "        return self.state(), reward, over\n",
    "\n",
    "    # Render the current frame of the game\n",
    "    def show(self):\n",
    "        from matplotlib import pyplot as plt\n",
    "        plt.figure(figsize=(3, 3))\n",
    "        plt.imshow(self.env.render())\n",
    "        plt.show()\n",
    "\n",
    "\n",
    "env = MyWrapper()\n",
    "env.reset()\n",
    "\n",
    "env.show()"
   ]
  },
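  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A note on how `step` above aggregates one agent decision, as read from the code; the symbols are assumed notation, not part of the notebook. Each call applies the chosen action $a_i \\in \\{0, 1, 2, 3\\}$ once as environment action $a_i + 1$, then sends environment action $-1 + 1 = 0$ for four more substeps (assumed here to be the no-op of the MPE discrete action space). The reward returned for agent $i$ is the sum over the $K = 5$ substeps:\n",
    "\n",
    "$$R_i = \\sum_{k=1}^{K} r_i^{(k)}$$\n",
    "\n",
    "The counter `step_n` advances by one per call, so an episode is cut off at the 100th call, that is, after at most 500 environment cycles."
   ]
  },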
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<__main__.A2C at 0x26407b4e770>, <__main__.A2C at 0x26407b4e440>]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "\n",
    "class A2C:\n",
    "\n",
    "    def __init__(self, model_actor, model_critic, model_critic_delay,\n",
    "                 optimizer_actor, optimizer_critic):\n",
    "        self.model_actor = model_actor\n",
    "        self.model_critic = model_critic\n",
    "        self.model_critic_delay = model_critic_delay\n",
    "        self.optimizer_actor = optimizer_actor\n",
    "        self.optimizer_critic = optimizer_critic\n",
    "\n",
    "        self.model_critic_delay.load_state_dict(self.model_critic.state_dict())\n",
    "        self.requires_grad(self.model_critic_delay, False)\n",
    "\n",
    "    def soft_update(self, _from, _to):\n",
    "        # Polyak averaging: move each target parameter 1% toward the source\n",
    "        for param_from, param_to in zip(_from.parameters(), _to.parameters()):\n",
    "            value = param_to.data * 0.99 + param_from.data * 0.01\n",
    "            param_to.data.copy_(value)\n",
    "\n",
    "    def requires_grad(self, model, value):\n",
    "        for param in model.parameters():\n",
    "            param.requires_grad_(value)\n",
    "\n",
    "    def train_critic(self, state, reward, next_state, over):\n",
    "        self.requires_grad(self.model_critic, True)\n",
    "        self.requires_grad(self.model_actor, False)\n",
    "\n",
    "        # Compute values and targets\n",
    "        value = self.model_critic(state)\n",
    "\n",
    "        with torch.no_grad():\n",
    "            target = self.model_critic_delay(next_state)\n",
    "        target = target * 0.99 * (1 - over) + reward\n",
    "\n",
    "        # Temporal-difference error, i.e. the TD loss\n",
    "        loss = torch.nn.functional.mse_loss(value, target)\n",
    "\n",
    "        loss.backward()\n",
    "        self.optimizer_critic.step()\n",
    "        self.optimizer_critic.zero_grad()\n",
    "        self.soft_update(self.model_critic, self.model_critic_delay)\n",
    "\n",
    "        # Subtracting value is equivalent to removing a baseline\n",
    "        return (target - value).detach()\n",
    "\n",
    "    def train_actor(self, state, action, value):\n",
    "        self.requires_grad(self.model_critic, False)\n",
    "        self.requires_grad(self.model_actor, True)\n",
    "\n",
    "        # Recompute the probabilities of the actions that were taken\n",
    "        prob = self.model_actor(state)\n",
    "        prob = prob.gather(dim=1, index=action)\n",
    "\n",
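    "        # Implemented from the gradient of the policy-gradient objective.\n",
    "        # The Q(state, action) term in that formula is estimated here\n",
    "        # with the critic model\n",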
    "        prob = (prob + 1e-8).log() * value\n",
    "        loss = -prob.mean()\n",
    "\n",
    "        loss.backward()\n",
    "        self.optimizer_actor.step()\n",
    "        self.optimizer_actor.zero_grad()\n",
    "\n",
    "        return loss.item()\n",
    "\n",
    "\n",
    "model_actor = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 4),\n",
    "        torch.nn.Softmax(dim=1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "model_critic = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "model_critic_delay = [\n",
    "    torch.nn.Sequential(\n",
    "        torch.nn.Linear(10, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 64),\n",
    "        torch.nn.ReLU(),\n",
    "        torch.nn.Linear(64, 1),\n",
    "    ) for _ in range(2)\n",
    "]\n",
    "\n",
    "optimizer_actor = [\n",
    "    torch.optim.Adam(model_actor[i].parameters(), lr=1e-3) for i in range(2)\n",
    "]\n",
    "\n",
    "optimizer_critic = [\n",
    "    torch.optim.Adam(model_critic[i].parameters(), lr=5e-3) for i in range(2)\n",
    "]\n",
    "\n",
    "a2c = [\n",
    "    A2C(model_actor[i], model_critic[i], model_critic_delay[i],\n",
    "        optimizer_actor[i], optimizer_critic[i]) for i in range(2)\n",
    "]\n",
    "\n",
    "model_actor = None\n",
    "model_critic = None\n",
    "model_critic_delay = None\n",
    "optimizer_actor = None\n",
    "optimizer_critic = None\n",
    "\n",
    "a2c"
   ]
  },
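  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The updates performed by `train_critic`, `train_actor`, and `soft_update` above can be summarized as follows. This is a reading of the code, not an addition to it, and the symbols are assumed notation that does not appear in the notebook: $V_\\phi$ for `model_critic`, $V_{\\bar\\phi}$ for `model_critic_delay`, $\\pi_\\theta$ for `model_actor`, $d_t$ for `over`, $N$ for the number of steps in the episode, and $\\gamma = 0.99$, $\\tau = 0.01$ for the two hard-coded constants.\n",
    "\n",
    "TD target and critic loss:\n",
    "\n",
    "$$y_t = r_t + \\gamma \\, (1 - d_t) \\, V_{\\bar\\phi}(s_{t+1})$$\n",
    "\n",
    "$$L_{\\text{critic}} = \\frac{1}{N} \\sum_{t} \\left( V_\\phi(s_t) - y_t \\right)^2$$\n",
    "\n",
    "Baseline-subtracted signal returned by `train_critic` (computed with the critic values from before the optimizer step) and the actor loss:\n",
    "\n",
    "$$A_t = y_t - V_\\phi(s_t)$$\n",
    "\n",
    "$$L_{\\text{actor}} = -\\frac{1}{N} \\sum_{t} \\log\\left( \\pi_\\theta(a_t \\mid s_t) + 10^{-8} \\right) A_t$$\n",
    "\n",
    "Soft update of the delayed critic:\n",
    "\n",
    "$$\\bar\\phi \\leftarrow (1 - \\tau) \\, \\bar\\phi + \\tau \\, \\phi$$\n",
    "\n",
    "Each of the two agents applies these updates to its own networks using only its own observations and rewards."
   ]
  },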
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.0, -162.704345703125]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython import display\n",
    "import random\n",
    "\n",
    "\n",
    "# Play one episode and record the transitions\n",
    "def play(show=False):\n",
    "    state = []\n",
    "    action = []\n",
    "    reward = []\n",
    "    next_state = []\n",
    "    over = []\n",
    "\n",
    "    s = env.reset()\n",
    "    o = False\n",
    "    while not o:\n",
    "        a = []\n",
    "        for i in range(2):\n",
    "            # Compute the action by sampling from the actor's output\n",
    "            prob = a2c[i].model_actor(torch.FloatTensor(s[i]).reshape(\n",
    "                1, -1))[0].tolist()\n",
    "            a.append(random.choices(range(4), weights=prob, k=1)[0])\n",
    "\n",
    "        # Execute the actions\n",
    "        ns, r, o = env.step(a)\n",
    "\n",
    "        state.append(s)\n",
    "        action.append(a)\n",
    "        reward.append(r)\n",
    "        next_state.append(ns)\n",
    "        over.append(o)\n",
    "\n",
    "        s = ns\n",
    "\n",
    "        if show:\n",
    "            display.clear_output(wait=True)\n",
    "            env.show()\n",
    "\n",
    "    state = torch.FloatTensor(state)\n",
    "    action = torch.LongTensor(action).unsqueeze(-1)\n",
    "    reward = torch.FloatTensor(reward).unsqueeze(-1)\n",
    "    next_state = torch.FloatTensor(next_state)\n",
    "    over = torch.LongTensor(over).reshape(-1, 1)\n",
    "\n",
    "    return state, action, reward, next_state, over, reward.sum(\n",
    "        dim=0).flatten().tolist()\n",
    "\n",
    "\n",
    "state, action, reward, next_state, over, reward_sum = play()\n",
    "\n",
    "reward_sum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 -1.5267943143844604 [9.0, -1073.5245361328125]\n"
     ]
    }
   ],
   "source": [
    "def train():\n",
    "    # Train for N episodes\n",
    "    for epoch in range(500):\n",
    "        state, action, reward, next_state, over, _ = play()\n",
    "\n",
    "        for i in range(2):\n",
    "            value = a2c[i].train_critic(state[:, i], reward[:, i],\n",
    "                                        next_state[:, i], over)\n",
    "            loss = a2c[i].train_actor(state[:, i], action[:, i], value)\n",
    "\n",
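    "        # Evaluate periodically; with 500 epochs and an interval of 2500,\n",
    "        # this only runs at epoch 0\n",
    "        if epoch % 2500 == 0:\n",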
    "            test_result = [play()[-1] for _ in range(20)]\n",
    "            test_result = torch.FloatTensor(test_result).mean(dim=0).tolist()\n",
    "            print(epoch, loss, test_result)\n",
    "\n",
    "\n",
    "train()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAR8AAAEYCAYAAABlUvL1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAj40lEQVR4nO3de3BTZf4/8HcuTWgpSWhLE7q0gIpCl4vcLBF2na9kqdjxAl1H2IqVYWCEcK3w1TqIl3Upi7Ojy66UdVet812F3a6LSuWytUhBSAtUi6WwBQR+qUBaodukRZpe8vz+0J4lUqDp7Uns+zVzZsh5npN8nmPP25Nzcs5RCSEEiIh6mFp2AUTUOzF8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICmnh8/rrr2PIkCHo06cPkpKScPDgQVmlEJEEUsLnb3/7GzIyMvD888/j888/x5gxY5CcnIzq6moZ5RCRBCoZF5YmJSVh4sSJ+OMf/wgA8Pl8iI+Px5IlS/DMM8/0dDlEJIG2pz+wsbERJSUlyMzMVOap1WrYbDY4HI42l/F6vfB6vcprn8+HmpoaREdHQ6VSdXvNRNQ+QgjU1dUhLi4OavWNv1j1ePhcvHgRLS0tMJvNfvPNZjP+/e9/t7lMVlYWXnzxxZ4oj4i6QGVlJQYNGnTDPj0ePh2RmZmJjIwM5bXb7UZCQgIqKythMBgkVkZEV/N4PIiPj0e/fv1u2rfHwycmJgYajQZVVVV+86uqqmCxWNpcRq/XQ6/XXzPfYDAwfIiCUHsOh/T42S6dTofx48ejoKBAmefz+VBQUACr1drT5RCRJFK+dmVkZCA9PR0TJkzAXXfdhddeew2XL1/G3LlzZZRDRBJICZ9HH30U33zzDdasWQOXy4U777wTO3fuvOYgNBH9eEn5nU9neTweGI1GuN1uHvMhCiKBbJu8touIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUkRcPjs3bsXDzzwAOLi4qBSqfDBBx/4tQshsGbNGgwcOBDh4eGw2Ww4efKkX5+amhqkpaXBYDDAZDJh3rx5qK+v79RAiCi0BBw+ly9fxpgxY/D666+32b5+/Xps2LABmzZtQnFxMfr27Yvk5GQ0NDQofdLS0lBeXo78/Hzk5eVh7969WLBgQcdHQUShR3QCALF161bltc/nExaLRbzyyivKvNraWqHX68XmzZuFEEIcO3ZMABCHDh1S+uzYsUOoVCpx7ty5dn2u2+0WAITb7e5M+UTUxQLZNrv0mM+ZM2fgcrlgs9mUeUajEUlJSXA4HAAAh8MBk8mECRMmKH1sNhvUajWKi4vbfF+v1wuPx+M3EVFo69LwcblcAHDNM9fNZrPS5nK5EBsb69eu1WoRFRWl9PmhrKwsGI1GZYqPj+/KsolIgpA425WZmQm3261MlZWVsksiok7q0vCxWCwAgKqqKr/5VVVVSpvFYkF1dbVfe3NzM2pqapQ+P6TX62EwGPwmIgptXRo+Q4cOhcViQUFBgTLP4/GguLgYVqsVAGC1WlFbW4uSkhKlz+7du+Hz+ZCUlNSV5RBRENMGukB9fT1OnTqlvD5z5gxKS0sRFRWFhIQELF++HC+//DKGDRuGoUOH4rnnnkNcXBwefvhhAMCIESNw3333Yf78+di0aROampqwePFizJo1C3FxcV02MCIKcoGeSvv0008FgGum9PR0IcR3p9ufe+45YTabhV6vF1OnThUVFRV+73Hp0iUxe/ZsERkZKQwGg5g7d66oq6trdw081U4UnALZNlVCCCEx+zrE4/HAaDTC7Xbz+A9REAlk2wyJs11E9OPD8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RS
MHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSIqDwycrKwsSJE9GvXz/Exsbi4YcfRkVFhV+fhoYG2O12REdHIzIyEqmpqdc8PtnpdCIlJQURERGIjY3FqlWr0Nzc3PnREFHICCh8CgsLYbfbUVRUhPz8fDQ1NWHatGm4fPmy0mfFihXYtm0bcnNzUVhYiPPnz2PmzJlKe0tLC1JSUtDY2IgDBw7gnXfeQU5ODtasWdN1oyKi4NeZpxNWV1cLAKKwsFAIIURtba0ICwsTubm5Sp/jx48LAMLhcAghhNi+fbtQq9XC5XIpfbKzs4XBYBBer7ddn8snlhIFp0C2zU4d83G73QCAqKgoAEBJSQmamppgs9mUPsOHD0dCQgIcDgcAwOFwYNSoUTCbzUqf5ORkeDwelJeXt/k5Xq8XHo/HbyKi0Nbh8PH5fFi+fDkmT56MkSNHAgBcLhd0Oh1MJpNfX7PZDJfLpfS5Onha21vb2pKVlQWj0ahM8fHxHS2biIJEh8PHbrfj6NGj2LJlS1fW06bMzEy43W5lqqys7PbPJKLupe3IQosXL0ZeXh727t2LQYMGKfMtFgsaGxtRW1vrt/dTVVUFi8Wi9Dl48KDf+7WeDWvt80N6vR56vb4jpRJRkApoz0cIgcWLF2Pr1q3YvXs3hg4d6tc+fvx4hIWFoaCgQJlXUVEBp9MJq9UKALBarSgrK0N1dbXSJz8/HwaDAYmJiZ0ZCxGFkID2fOx2O9577z18+OGH6Nevn3KMxmg0Ijw8HEajEfPmzUNGRgaioqJgMBiwZMkSWK1WTJo0CQAwbdo0JCYmYs6cOVi/fj1cLhdWr14Nu93OvRui3iSQ02gA2pzefvttpc+VK1fEokWLRP/+/UVERISYMWOGuHDhgt/7nD17VkyfPl2Eh4eLmJgY8dRTT4mmpqZ218FT7UTBKZBtUyWEEPKir2M8Hg+MRiPcbjcMBoPscojoe4Fsm7y2i4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RSRFQ+GRnZ2P06NEwGAwwGAywWq3YsWOH0t7Q0AC73Y7o6GhERkYiNTVVeRRyK6fTiZSUFERERCA2NharVq1Cc3Nz14yGiEJGQOEzaNAgrFu3DiUlJTh8+DDuvfdePPTQQygvLwcArFixAtu2bUNubi4KCwtx/vx5zJw5U1m+paUFKSkpaGxsxIEDB/DOO+8gJycHa9as6dpREVHw6+wTCvv37y/+8pe/iNraWhEWFiZyc3OVtuPHjwsAwuFwCCGE2L59u1Cr1cLlcil9srOzhcFgEF6vt92fySeWEgWnQLbNDh/zaWlpwZYtW3D58mVYrVaUlJSgqakJNptN6TN8+HAkJCTA4XAAABwOB0aNGgWz2az0SU5OhsfjUfae2uL1euHxePwmIgptAYdPWVkZIiMjodfr8eSTT2Lr1q1ITEyEy+WCTqeDyWTy6282m+FyuQAALpfLL3ha21vbricrKwtGo1GZ4uPjAy2biIJMwOFzxx13oLS0FMXFxVi4cCHS09Nx7Nix7qhNkZmZCbfbrUyVlZXd+nlE1P20gS6g0+lw2223AQDGjx+PQ4cO4fe//z0effRRNDY2ora21m/vp6qqChaLBQBgsVhw8OBBv/drPRvW2qcter0eer0+0FKJKIh1+nc+Pp8PXq8X48ePR1hYGAoKCpS2iooKOJ1OWK1WAIDVakVZWRmqq6uVPvn5+TAYDEhMTOxsKUQUQgLa88nMzMT06dORkJCAuro6vPfee9izZw927doFo9GIefPmISMjA1FRUTAYDFiyZAmsVismTZoEAJg2bRoSExMxZ84crF+/Hi6XC6tXr4bd
bueeDVEvE1D4VFdX4/HHH8eFCxdgNBoxevRo7Nq1C7/4xS8AAK+++irUajVSU1Ph9XqRnJyMjRs3KstrNBrk5eVh4cKFsFqt6Nu3L9LT0/HSSy917aiIKOiphBBCdhGB8ng8MBqNcLvdMBgMssshou8Fsm3y2i4ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCRFp8Jn3bp1UKlUWL58uTKvoaEBdrsd0dHRiIyMRGpqqvJI5FZOpxMpKSmIiIhAbGwsVq1ahebm5s6UQkQhpsPhc+jQIfzpT3/C6NGj/eavWLEC27ZtQ25uLgoLC3H+/HnMnDlTaW9paUFKSgoaGxtx4MABvPPOO8jJycGaNWs6PgoiCj2iA+rq6sSwYcNEfn6+uOeee8SyZcuEEELU1taKsLAwkZubq/Q9fvy4ACAcDocQQojt27cLtVotXC6X0ic7O1sYDAbh9Xrb9flut1sAEG63uyPlE1E3CWTb7NCej91uR0pKCmw2m9/8kpISNDU1+c0fPnw4EhIS4HA4AAAOhwOjRo2C2WxW+iQnJ8Pj8aC8vLzNz/N6vfB4PH4TEYW2gJ7VDgBbtmzB559/jkOHDl3T5nK5oNPpYDKZ/OabzWa4XC6lz9XB09re2taWrKwsvPjii4GWSkRBLKA9n8rKSixbtgzvvvsu+vTp0101XSMzMxNut1uZKisre+yziah7BBQ+JSUlqK6uxrhx46DVaqHValFYWIgNGzZAq9XCbDajsbERtbW1fstVVVXBYrEAACwWyzVnv1pft/b5Ib1eD4PB4DcRUWgLKHymTp2KsrIylJaWKtOECROQlpam/DssLAwFBQXKMhUVFXA6nbBarQAAq9WKsrIyVFdXK33y8/NhMBiQmJjYRcMiomAX0DGffv36YeTIkX7z+vbti+joaGX+vHnzkJGRgaioKBgMBixZsgRWqxWTJk0CAEybNg2JiYmYM2cO1q9fD5fLhdWrV8Nut0Ov13fRsIgo2AV8wPlmXn31VajVaqSmpsLr9SI5ORkbN25U2jUaDfLy8rBw4UJYrVb07dsX6enpeOmll7q6FCIKYiohhJBdRKA8Hg+MRiPcbjeP/xAFkUC2zS7f86GeI4RAVXUVvv32W2g0Gvwk7ifQavmflEID/1JDkBAClecqkbsnF7sv7Ma3qm+hgQZ3RtyJX035Fcb8dAw0Go3sMoluiOETYoQQ+PzY5/h1/q9RH1MPzRANdNABAI60HMGRwiN44sITmGWbxQCioMZbaoSYqm+q8Jtdv8G3cd9Co/cPF7VGDZVZhZzTOfjs0GeSKiRqH4ZPCBFC4IPCD+COdkOlUrXZR6VSQRWlwubizfB6vT1cIVH7MXxCiMfjwc7TO6EJv/HXKZVahRPqEyg7XtZDlREFjuETQpqbm3HFd+W6ez1Xa9G04PK3l3ugKqKOYfiEEK1Wiz7qPmjPT7PULWqE9wnvgaqIOobhE0IMBgN+MfgXaGlouWE/IQRubbkVoxNH37AfkUwMnxCiUqkw856ZiLwYed29HyEEfDU+zJ44u0dve0IUKIZPiBloGYjMqZnQX9DD1+jzaxMtAr5vfEgblIZ7ku6RVCFR+/BHhiFGpVIhaXQSXjO8hi17tmDfhX1oUDdAAw0SwxLxq7t/hYljJvIyCwp6vLA0hPl8Ppw7fw719fXQarUYMngIwsLCZJdFvRgvLO0l1Go14gfFyy6DqEN4zIeIpOCeDym8Xi+cZ8/i4ve3uI2MjMStd9yB8PDwdv2wkSgQDB/Cf2pq8OkHH+Ds/v2A0wl9YyMAoFmjwU6zGQOTkvA/qamIGzSIIURdhgecezEhBL6qqEDBH/8I/fHjiNBorgkXIQQafT7UxsTAumwZxiYlQa3mt3VqWyDbJv+KerFjpaXY8fTT
MFRUoK9W2+ZejUqlgl6jwYCaGhQ9/zz25uW16/IOopth+PRSNRcvYs/GjYi9cgXaduzJqFUqDABwJCcHZ0+dYgBRpzF8eiGfz4eP/vxnmM6ehTqAYzgqlQqW+np8+MorvFcQdRrDpxdyu92oLi5GWAcOHqtVKoivvsL/O326Gyqj3iSg8HnhhRe+u1PeVdPw4cOV9oaGBtjtdkRHRyMyMhKpqanXPBrZ6XQiJSUFERERiI2NxapVq9Dc3Nw1o6F2OV5aikj39e+GeDPRAI7s3cuvXtQpAZ9q/+lPf4pPPvnkv29w1TVEK1aswMcff4zc3FwYjUYsXrwYM2fOxP79+wEALS0tSElJgcViwYEDB3DhwgU8/vjjCAsLw9q1a7tgONQep8vKYOrEzeX7aDT4uox3SaTOCTh8tFotLBbLNfPdbjfefPNNvPfee7j33nsBAG+//TZGjBiBoqIiTJo0Cf/6179w7NgxfPLJJzCbzbjzzjvx61//Gk8//TReeOEF6HS6zo+Ibqqz+yv8rQ91hYCP+Zw8eRJxcXG45ZZbkJaWBqfTCQAoKSlBU1MTbDab0nf48OFISEiAw+EAADgcDowaNQpms1npk5ycDI/Hg/Ly8ut+ptfrhcfj8Zuo48J0Ovhu3u26fEJAy/9RUCcFFD5JSUnIycnBzp07kZ2djTNnzuBnP/sZ6urq4HK5oNPpYDKZ/JYxm81wuVwAAJfL5Rc8re2tbdeTlZUFo9GoTPHxvJiyM0bdfTcu+joeP/XNzbh9ypQurIh6o4C+dk2fPl359+jRo5GUlITBgwfj73//O8LDu+9+wZmZmcjIyFBeezweBlAn3Hr77fjYYoG4eLFDX6FqwsIwfdIkfv2iTunUqXaTyYTbb78dp06dgsViQWNjI2pra/36VFVVKceILBbLNWe/Wl+3dRyplV6vh8Fg8Juo4/R6PUY9+CDqhQj4jJXX58OAn/0MloEDu6k66i06FT719fX46quvMHDgQIwfPx5hYWEoKChQ2isqKuB0OmG1WgEAVqsVZWVlqP7+qmkAyM/Ph8FgQGJiYmdKoQCoVCrYUlOh+vnP0RRA+PiEwKXBg/HL5ct5fRd1WkB/QStXrkRhYSHOnj2LAwcOYMaMGdBoNJg9ezaMRiPmzZuHjIwMfPrppygpKcHcuXNhtVoxadIkAMC0adOQmJiIOXPm4MiRI9i1axdWr14Nu90OvV7fLQOktoWFheHBhQvhGT4cdT7fTfeAvD4fzsfE4IH//V/07du3h6qkH7OAjvl8/fXXmD17Ni5duoQBAwZgypQpKCoqwoABAwAAr776KtRqNVJTU+H1epGcnIyNGzcqy2s0GuTl5WHhwoWwWq3o27cv0tPT8dJLL3XtqKhdomNikP7b3+Jf//d/OPOPfyCqpQXhV13ZLoRAk8+HGp8PkZMn44mMDJhMJh7roS7BW2oQWlpa8NWJEziybx+cDge8338t1kZGwnLXXRhzzz0YMXo0f4dFNxXItsnwIT8NDQ1o/P5mYlqtlncxpIDwBvLUYX369OHDBqlH8JQFEUnB8CEiKRg+RCQFw4eIpGD4EJEUDB8ikoLhQ0RSMHyISAqGDxFJwfAhIikYPkQkBcOHiKRg+BCRFAwfIpKC4UNEUjB8iEgKhg8RScHwISIpGD5EJAXDh4ikCDh8zp07h8ceewzR0dEIDw/HqFGjcPjwYaVdCIE1a9Zg4MCBCA8Ph81mw8mTJ/3eo6amBmlpaTAYDDCZTJg3bx7q6+s7PxoiChkBhc9//vMfTJ48GWFhYdixYweOHTuG3/3ud+jfv7/SZ/369diwYQM2bdqE4uJi9O3bF8nJyWhoaFD6pKWloby8HPn5+cjLy8PevXuxYMGCrhsVEQW9gJ7b9cwzz2D//v3Yt29fm+1CCMTFxeGpp57CypUrAQButxtmsxk5OTmYNWsWjh8/jsTERBw6dAgTJkwAAOzcuRP3338/vv76a8TFxd20jq54bpcQAs3N
zaipqUFpaSnq6+tx4MABXL582a/fyJEjMWjQIAwbNgyDBg2CwWDgc6yIrqPbntv10UcfITk5GY888ggKCwvxk5/8BIsWLcL8+fMBAGfOnIHL5YLNZlOWMRqNSEpKgsPhwKxZs+BwOGAymZTgAQCbzQa1Wo3i4mLMmDHjms/1er3wer1+A+yI1sCprKzEvn37UFBQgBMnTqChoQE+nw9a7bWro6ioCD6fDzqdDlFRUUhKSoLNZsOYMWNgNBoZREQdFFD4nD59GtnZ2cjIyMCzzz6LQ4cOYenSpdDpdEhPT4fL5QIAmM1mv+XMZrPS5nK5EBsb61+EVouoqCilzw9lZWXhxRdfDKRUP62h88UXXyAnJweHDx9GY2Mj1Go1VCoVNBoNNBpNm8u2tgkhcPHiReTl5SEvLw8DBgzAL3/5S8yYMQMxMTEMIaIABXTMx+fzYdy4cVi7di3Gjh2LBQsWYP78+di0aVN31QcAyMzMhNvtVqbKysp2LyuEgNPpREZGBhYtWoSioiK0tLRAo9EEHBgqlQpqtRpqtRoXL15EdnY20tLSkJub63dMi4huLqDwGThwIBITE/3mjRgxAk6nEwBgsVgAAFVVVX59qqqqlDaLxYLq6mq/9tZjL619fkiv18NgMPhN7dHU1IT3338fTzzxBPbv3w8hRJftobQG0cWLF/Hb3/4WTz/9NE6fPo0ADqER9WoBhc/kyZNRUVHhN+/EiRMYPHgwAGDo0KGwWCwoKChQ2j0eD4qLi2G1WgEAVqsVtbW1KCkpUfrs3r0bPp8PSUlJHR7I1YQQ+Oabb7Bu3TpkZWXB7XZ329ei1vfdu3cvli5dqoyFiG4soPBZsWIFioqKsHbtWpw6dQrvvfce3njjDdjtdgDfbYjLly/Hyy+/jI8++ghlZWV4/PHHERcXh4cffhjAd3tK9913H+bPn4+DBw9i//79WLx4MWbNmtWuM1030xo8K1euxD//+U+lru6mVqtx/vx5PPvss3j//fcZQEQ3EdCpdgDIy8tDZmYmTp48iaFDhyIjI0M52wV8t/E///zzeOONN1BbW4spU6Zg48aNuP3225U+NTU1WLx4MbZt2wa1Wo3U1FRs2LABkZGR7arheqfzWoNn1apVKCsrk3IQWAgBnU6HlStXYubMmVCr+SNy6j0COdUecPgEg+sN0OPxYOnSpfjyyy+lnn1qDaBnn30WDzzwAM+EUa8RSPj8aP633NTUhDfffFN68ADffc1rbGxEdnY2Tpw4IbUWomD1owgfIQT27duHd999V3rwtFKpVKiqqsIrr7zC69aI2vCjCJ9vvvkGr732WtAd5FWpVCgpKcE//vGPoKuNSLaQDx8hBN566y1UVlYGzV7P1VQqFd58882AfhhJ1BuEfPg4nU5s3749aM8qqVQq1NfXY/Pmzdz7IbpKcG6x7SSEwObNm1FXVye7lBtSqVT4+OOP8fXXX8suhShohHT41NfX47PPPgvavZ5WKpUKdXV12L9/v+xSiIJGcG+1N1FRUYHz58/LLqNd1Go1du/ejebmZtmlEAWFkA6fPXv2hMyFnCqVCuXl5SETlkTdLaTD5/jx40H/letqly9fxqlTp2SXQRQUQmfLbcOZM2dklxAQjUaD4uJi2WUQBYWQDp9QPHXtdrtDsm6irhbS4dPY2Ci7hIAdOXIkJOsm6mohHT6hRqVShcwBcqLuxvAhIikYPkQkBcOnBwkhOvTUDKIfo5AOH51OJ7uEgI0bNy4k6ybqaiEdPtd70F+wEkIgMjKSez5ECPHwueWWW2SXEBCfz4fJkyfLLoMoKIR0+IwZMwYtLS2yy2g3g8GAW2+9VXYZREEhpMPn5z//ech89RJCYMyYMdc8p56otwrp8LntttuQkJAgu4x28fl8sNlsIROWRN0tpMMnPDwc9957b9BfKyWEQFRUFCZNmiS7FKKgEdLho1Kp8Mgjj6B///6yS7khIQRmzJgBs9ksuxSioKGVXUBHtF4f5fF4
0K9fP9hsNmzevDlov9KEh4cjOTk56O81TdRZHo8HANp1DWNIhs+lS5cAAPHx8ZIrab877rhDdglEPaaurg5Go/GGfUIyfKKiogB899icmw3wx8zj8SA+Ph6VlZU3fS72jxnXw3eCYT0IIVBXV4e4uLib9g3J8Gm9darRaOzVf2ytDAYD1wO4HlrJXg/t3SEI6QPORBS6GD5EJEVIho9er8fzzz8PvV4vuxSpuB6+w/XwnVBbDyrB+3oSkQQhuedDRKGP4UNEUjB8iEgKhg8RSRGS4fP6669jyJAh6NOnD5KSknDw4EHZJXWZrKwsTJw4Ef369UNsbCwefvhhVFRU+PVpaGiA3W5HdHQ0IiMjkZqaiqqqKr8+TqcTKSkpiIiIQGxsLFatWoXm5uaeHEqXWrduHVQqFZYvX67M6y3r4dy5c3jssccQHR2N8PBwjBo1CocPH1bahRBYs2YNBg4ciPDwcNhsNpw8edLvPWpqapCWlgaDwQCTyYR58+ahvr6+p4fiT4SYLVu2CJ1OJ9566y1RXl4u5s+fL0wmk6iqqpJdWpdITk4Wb7/9tjh69KgoLS0V999/v0hISBD19fVKnyeffFLEx8eLgoICcfjwYTFp0iRx9913K+3Nzc1i5MiRwmaziS+++EJs375dxMTEiMzMTBlD6rSDBw+KIUOGiNGjR4tly5Yp83vDeqipqRGDBw8WTzzxhCguLhanT58Wu3btEqdOnVL6rFu3ThiNRvHBBx+II0eOiAcffFAMHTpUXLlyRelz3333iTFjxoiioiKxb98+cdttt4nZs2fLGJIi5MLnrrvuEna7XXnd0tIi4uLiRFZWlsSquk91dbUAIAoLC4UQQtTW1oqwsDCRm5ur9Dl+/LgAIBwOhxBCiO3btwu1Wi1cLpfSJzs7WxgMBuH1ent2AJ1UV1cnhg0bJvLz88U999yjhE9vWQ9PP/20mDJlynXbfT6fsFgs4pVXXlHm1dbWCr1eLzZv3iyEEOLYsWMCgDh06JDSZ8eOHUKlUolz5851X/E3EVJfuxobG1FSUgKbzabMU6vVsNlscDgcEivrPm63G8B/L6YtKSlBU1OT3zoYPnw4EhISlHXgcDgwatQov/sHJScnw+PxoLy8vAer7zy73Y6UlBS/8QK9Zz189NFHmDBhAh555BHExsZi7Nix+POf/6y0nzlzBi6Xy289GI1GJCUl+a0Hk8mECRMmKH1sNhvUajWKi4t7bjA/EFLhc/HiRbS0tFxzUy6z2QyXyyWpqu7j8/mwfPlyTJ48GSNHjgQAuFwu6HQ6mEwmv75XrwOXy9XmOmptCxVbtmzB559/jqysrGvaest6OH36NLKzszFs2DDs2rULCxcuxNKlS/HOO+8A+O84brRNuFyua+4drtVqERUVJXU9hORV7b2F3W7H0aNH8dlnn8kupcdVVlZi2bJlyM/PR58+fWSXI43P58OECROwdu1aAMDYsWNx9OhRbNq0Cenp6ZKr65yQ2vOJiYmBRqO55oxGVVUVLBaLpKq6x+LFi5GXl4dPP/0UgwYNUuZbLBY0NjaitrbWr//V68BisbS5jlrbQkFJSQmqq6sxbtw4aLVaaLVaFBYWYsOGDdBqtTCbzb1iPQwcOBCJiYl+80aMGAGn0wngv+O40TZhsVhQXV3t197c3Iyamhqp6yGkwken02H8+PEoKChQ5vl8PhQUFMBqtUqsrOsIIbB48WJs3boVu3fvxtChQ/3ax48fj7CwML91UFFRAafTqawDq9WKsrIyvz+4/Px8GAyGa/6Qg9XUqVNRVlaG0tJSZZowYQLS0tKUf/eG9TB58uRrfmpx4sQJDB48GAAwdOhQWCwWv/Xg8XhQXFzstx5qa2tRUlKi9Nm9ezd8Ph+SkpJ6YBTXIe1Qdwdt2bJF6PV6kZOTI44dOyYWLFggTCaT3xmNULZw4UJhNBrFnj17xIULF5Tp22+/Vfo8+eSTIiEhQezevVscPnxYWK1WYbValfbW
U8zTpk0TpaWlYufOnWLAgAEhdYq5LVef7RKid6yHgwcPCq1WK37zm9+IkydPinfffVdERESIv/71r0qfdevWCZPJJD788EPx5ZdfioceeqjNU+1jx44VxcXF4rPPPhPDhg3jqfaO+MMf/iASEhKETqcTd911lygqKpJdUpcB0Ob09ttvK32uXLkiFi1aJPr37y8iIiLEjBkzxIULF/ze5+zZs2L69OkiPDxcxMTEiKeeeko0NTX18Gi61g/Dp7esh23btomRI0cKvV4vhg8fLt544w2/dp/PJ5577jlhNpuFXq8XU6dOFRUVFX59Ll26JGbPni0iIyOFwWAQc+fOFXV1dT05jGvwlhpEJEVIHfMhoh8Phg8RScHwISIpGD5EJAXDh4ikYPgQkRQMHyKSguFDRFIwfIhICoYPEUnB8CEiKRg+RCTF/wcmmaU6R7MswgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 300x300 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "[460.0, -462.07421875]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Call play with its flag set to True and show the last element of the result\n",
    "play(True)[-1]"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "Chapter 9 - Policy Gradient Algorithm.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
